Online learning for audio clustering and segmentation

Authors

  • Alberto Bietti
  • Arshia Cont
  • Francis Bach
Abstract

Audio segmentation, an essential problem in many audio signal processing tasks, aims to divide an audio signal into homogeneous chunks, or segments. Most current approaches rely on a change-point detection phase that finds segment boundaries, followed by a similarity matching phase that identifies similar segments. In this thesis, we focus instead on joint segmentation and clustering algorithms, which solve both tasks simultaneously through unsupervised learning techniques in sequential models. Hidden Markov and semi-Markov models are a natural choice for this modeling task, and we present their use in the context of audio segmentation. We then explore online learning techniques in sequential models and their application to real-time audio segmentation. We present an existing online EM algorithm for hidden Markov models and extend it to hidden semi-Markov models by introducing a different parameterization of semi-Markov chains. Finally, we develop new online learning algorithms for sequential models based on incremental optimization of surrogate functions.
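To make the idea of joint segmentation and clustering with a hidden Markov model concrete, here is a minimal sketch, not the thesis's method. It assumes 1-D features, Gaussian emissions with fixed (already known) means and variances, at least two states, and a single hypothetical `stay_prob` parameter for the transition matrix. It performs Viterbi decoding only; the online EM and surrogate-based learning mentioned in the abstract are not implemented here. The function name `viterbi_segments` and all parameter names are illustrative.

```python
import numpy as np

def viterbi_segments(features, means, variances, stay_prob=0.95):
    """Decode the most likely state path of a 1-D Gaussian HMM and
    return it as a list of (start, end_exclusive, state) runs."""
    features = np.asarray(features, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    n_states = len(means)  # assumed >= 2
    n_frames = len(features)

    # Assumed transition model: stay with probability stay_prob,
    # otherwise switch uniformly to one of the other states.
    switch = (1.0 - stay_prob) / (n_states - 1)
    log_trans = np.log(np.full((n_states, n_states), switch))
    np.fill_diagonal(log_trans, np.log(stay_prob))

    # Gaussian log-likelihood of every frame under every state.
    diff = features[:, None] - means[None, :]
    log_emit = -0.5 * (np.log(2 * np.pi * variances)[None, :]
                       + diff ** 2 / variances[None, :])

    # Forward pass: best log-score of any path ending in each state.
    score = -np.log(n_states) + log_emit[0]  # uniform initial distribution
    backptr = np.zeros((n_frames, n_states), dtype=int)
    for t in range(1, n_frames):
        cand = score[:, None] + log_trans  # cand[i, j]: move from i to j
        backptr[t] = np.argmax(cand, axis=0)
        score = cand[backptr[t], np.arange(n_states)] + log_emit[t]

    # Backward pass: follow the back-pointers from the best final state.
    path = np.zeros(n_frames, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]

    # Collapse the path into runs: boundaries give the segmentation,
    # state labels give the clustering.
    segments = []
    start = 0
    for t in range(1, n_frames + 1):
        if t == n_frames or path[t] != path[start]:
            segments.append((start, t, int(path[start])))
            start = t
    return segments
```

For example, `viterbi_segments([0.0] * 5 + [5.0] * 5 + [0.0] * 5, means=[0.0, 5.0], variances=[1.0, 1.0])` returns `[(0, 5, 0), (5, 10, 1), (10, 15, 0)]`: the first and last segments receive the same state label, which is the sense in which a single decoding pass both segments and clusters.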


Similar resources

The effects of segmentation and redundancy methods on cognitive load and vocabulary learning and comprehension of English lessons in a multimedia learning environment

The present study examined the effects of segmentation and redundancy methods on cognitive load, vocabulary learning, and comprehension of English lessons in a multimedia learning environment. The study is applied in purpose and uses a true experimental design. The statistical population of the present study includes all people aged 14 to 16 who are enrolled in ...


Cluster-Based Image Segmentation Using Fuzzy Markov Random Field

Image segmentation is an important task in image processing and computer vision that attracts the attention of many researchers. Each pixel in an image carries two kinds of information: statistical and structural, which refer to the feature value of the pixel data and the local correlation of the pixel data, respectively. Markov random field (MRF) is a tool for modeling statistical and structural inf...


Automatic Segmentation and Summarization of Spoken Lectures

The ever-increasing number of online lectures has created an unprecedented opportunity for distance learning. Most online lectures are presented as unstructured text, audio and/or video files, which makes it difficult for students to locate relevant lectures and browse through them. In this thesis, we investigated several automatic lecture segmentation and summarization algorithms. Automatic lectur...


Vodcast: A Breakthrough in Developing Incidental Vocabulary Learning

Incidental vocabulary learning is often seen as superior to direct instruction on many occasions. Meanwhile, upon the emergence of the World Wide Web, second language (SL) learners have been introduced to 'podcasts' (recorded audio and video online broadcasts) which could be authentic sources of vocabulary learning. The relatively recent phenomenon of video podcast (vodcast) might be considered...


Yardsticks for Evaluating ELT Pod/Vodcasts in Online Materials Development and Their Implications for Teacher Education and Art Assisted Language Learning

ELT online materials development, which is a multifaceted multidisciplinary area, is not welcomed by many teachers, because it is demanding, challenging and confusing. They fear facing new technologies in their teaching sessions to avoid failing or being caught by other audiences. Furthermore, they struggle hard in evaluating their pod/vodcasts. In order to remove the fears and barriers, ...



Publication date: 2015